Purpose: Updated curricula recommend alternative assessment tools, and recent trends have
encouraged their use in class. It is therefore important to understand how alternative
assessment affects students' academic outcomes and which techniques are most effective in
which contexts. This study examines the impact of alternative assessment techniques on
achievement.
Research Methods: A meta-analysis was conducted to combine the effect sizes of the primary
studies.
Findings: Data analysis indicated that alternative assessment techniques have a significant
and positive effect (d=0.84) on students' academic achievement. The largest subject-matter
effect was found in Mathematics courses (d=0.90), and the effect of using portfolios in class
(d=1.01) is worthy of note. According to the moderator analysis, the effect sizes do not vary
significantly by subject matter or type of alternative assessment technique, but they do
differ significantly by the school level of the students.

Implications for Research and Practice: The results highlighted portfolios as a highly effective
assessment technique for students' academic achievement and revealed the impact of
alternative assessment techniques on enhancing academic outcomes. However, the low
effectiveness of authentic assessment at the primary level may be associated with the
development of creativity and critical thinking skills over time.
Introduction

A system based on the constructivist approach has been introduced with the 
education reform implemented in primary and secondary curricula in Turkey since 
2004. The newly-developed curricula based on this approach have broken new ground 
in course content, teaching methods, materials and measurement and evaluation 
techniques (Gelbal and Kelecioglu, 2007; Yesilyurt, 2012). It appears that the most 
important innovation in these programs, which emphasize the individual differences 
between the learners, is in the field of evaluation (Coruhlu, Nas and Cepni, 2009; 
Yaman, 2011). With this change in the curricula, the use of performance-based 
alternative assessment tools as well as traditional assessment techniques has been 
suggested (Duban and Kucukyilmaz, 2008; Ozdemir, 2010). In this way, it has become 
important to evaluate students' skills and success from all aspects during the learning 
process and to observe their improvement. 

Alternative assessment is defined as a non-traditional approach that informs 
students about what they know and can do, determines what they comprehend about 
the subject, and evaluates their performance (Gummer and Shepardson, 2001). 
Alternative assessment with reliable, performance-based, realistic, constructivist and 
feasible features includes activities in which knowledge and skill are connected and 
knowledge is acquired in different learning environments. It teaches students to be 
aware of their own ideas and to evaluate themselves by allowing students to analyze 
their own learning styles. In other words, alternative assessment provides flexible and 
meaningful learning experiences that take into consideration the learning styles of the 
students. From this aspect, it may be distinguished from standardized assessment 
techniques (Korkmaz, 2006). 

Alternative assessment techniques enable students to be evaluated multidimensionally,
as they offer students multiple evaluation opportunities in which
to display their knowledge, skills and attitudes (MEB, 2005). Additionally, alternative
assessment assists teachers in creating a motivating learning environment that fits each 
student's learning needs and learning style, follows individual student achievement, 
and creates an atmosphere that takes into consideration students' self-assessment of 
their own learning process (Greenstein, 2010). 

There have been many primary studies on alternative assessment techniques in 
Turkey. Even though the frequency of use of these techniques differs according to the 
subject matter (Yazici and Sozbilir, 2014), the studies have asserted that portfolios, peer 
assessment, diagnostic-branched trees, structured grids (Buyuktokatli and Bayraktar, 
2014; Yazici and Sozbilir, 2014) and self-assessment (Karakus, 2010; Kosterelioglu and 
Celen, 2016) are the least-used techniques. However, the literature focuses on the 
positive effects of these techniques on students. 

It is important that teachers use the assessment techniques recommended in their
curricula to evaluate their students and their teaching activities. The current
curriculum suggests that learners should be assessed in a way that brings out all-round
and higher-order thinking skills, and to this end it provides teachers with assessment
tools to evaluate students from every aspect. However, studies (Dos, 2016; Gerek,
2006; Gelbal and Kelecioglu, 2007) show that teachers who are unfamiliar with the
curricula view these techniques with suspicion and consider them difficult to
apply.

Many studies on the effectiveness of alternative assessment techniques have been
carried out. However, no study found in either the national or the international literature
has examined the effects of alternative assessment techniques on a large scale or
determined which techniques are most effective for achievement. Accordingly, this
study was designed to review the literature on alternative assessment, which has
recently gained popularity in Turkey. Data were drawn from the primary studies,
and their findings were combined through meta-analysis.
Calculating the effect sizes of the primary studies that investigated the
impact of alternative assessment on academic outcomes makes it possible to discuss
which assessment techniques are most effective.

In light of these facts, and seeing the need for this extensive review in the Turkish 
assessment context, the following research questions were designed for the present 
meta-analysis: 

1. What are the effects of alternative assessment techniques on student 
achievement? 

2. How do various alternative assessment techniques (e.g., portfolio, self- 
assessment) moderate the overall average effect size? 

3. How do demographic features of the studies (i.e., subject matter and school 
level) moderate this overall effect? 

Method 


Research Design 

The current study primarily aimed to examine the impact of alternative assessment
techniques on academic achievement, and a meta-analysis was conducted for this
purpose. Meta-analysis is a statistical procedure
used to combine, synthesize and interpret the experimental findings of primary
studies on a specific research question (Wolf, 1986). This study followed Cooper's
seven steps for conducting a systematic review: (1) formulating the
problem, (2) searching the literature, (3) gathering information from studies, (4)
evaluating the quality of studies, (5) analyzing and integrating the outcomes, (6)
interpreting the data, and (7) presenting the results (Cooper, 2010).

Research Instruments and Procedures 

Based on the research problem, an extensive literature review was designed
to identify the primary studies. The key words used in this review consisted primarily of
“alternative assessment”, “portfolio”, “grid”, “diagnostic tree”, “peer assessment”,
“self-assessment” and their variations in Turkish. The following electronic databases
were among the sources examined: the CoHE National Dissertation Center, ERIC,
PsycINFO, the ASOS social sciences index and many journals of Education Faculties in
Turkey, in addition to Google web search and Google Scholar for conference proceedings.
The databases were reviewed regularly up to August
2014, and the primary studies retrieved were screened for inclusion in the analysis.
To be included, a study had
to meet the following criteria:

• Address the impact of alternative assessment techniques on students' 
achievements, 

• Contain at least two independent samples, with pretest-posttest experimental 
or quasi-experimental design, 

• Contain sufficient statistical information to extract effect size, 

• Be administered in Turkey, 

• Be published between 2004 and August 2014. 

Because each group must contain at least 10 students to
ensure the approximate normal distribution of Cohen's d effect size (Hedges and
Olkin, 1985), studies carried out with smaller samples were excluded from the
analysis. Following Lipsey and Wilson's (2001) suggestions, a coding form covering
both statistical and theoretical data was developed to
transform the features of all studies included in this meta-analysis into
categorical variables.
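
The screening rules above can be sketched as a simple filter. This is only an illustration: the record layout, the field names and the three candidate studies are hypothetical and are not taken from the coding form used in this study.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CandidateStudy:
    """Hypothetical coding-form record; field names are illustrative only."""
    label: str
    outcome_is_achievement: bool   # addresses the impact of AAT on achievement
    n_experimental: int            # size of the alternative-assessment group
    n_control: int                 # size of the traditionally assessed group
    pretest_posttest_design: bool  # experimental or quasi-experimental design
    has_effect_size_data: bool     # enough statistics to extract an effect size
    country: str
    published: date


SEARCH_START = date(2004, 1, 1)   # assumed start of the 2004 window
SEARCH_END = date(2014, 8, 31)    # end of August 2014
MIN_GROUP_SIZE = 10               # minimum number of students per group


def meets_inclusion_criteria(study: CandidateStudy) -> bool:
    """Apply the five listed criteria plus the minimum group size."""
    return (
        study.outcome_is_achievement
        and study.pretest_posttest_design
        and study.has_effect_size_data
        and study.country == "Turkey"
        and SEARCH_START <= study.published <= SEARCH_END
        and min(study.n_experimental, study.n_control) >= MIN_GROUP_SIZE
    )


# Hypothetical candidates: B has groups that are too small, C is too recent.
candidates = [
    CandidateStudy("hypothetical-A", True, 25, 24, True, True, "Turkey", date(2009, 5, 1)),
    CandidateStudy("hypothetical-B", True, 8, 9, True, True, "Turkey", date(2011, 3, 1)),
    CandidateStudy("hypothetical-C", True, 30, 31, True, True, "Turkey", date(2015, 1, 1)),
]
included = [s.label for s in candidates if meets_inclusion_criteria(s)]
print(included)
```

Keeping each criterion as a separate field makes it easy to report why a given study was excluded.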

To establish the interrater reliability of the coding form, about 25% (n=6) of the included
studies were randomly selected and independently rated and coded by two
researchers. The forms were compared using the formula [agreement / (agreement +
disagreement)] x 100 (Miles and Huberman, 1994), and the intercoder
reliability was found to be 98%. Disagreements were discussed until they
were resolved, and the form was corrected accordingly.
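
As a minimal sketch, the agreement formula can be written as a short function. The tally of 49 agreements and 1 disagreement is hypothetical, chosen only because it reproduces a 98% rate; the actual counts are not reported here.

```python
def percent_agreement(agreements: int, disagreements: int) -> float:
    """Miles and Huberman's rate: agreement / (agreement + disagreement) x 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("no coded items to compare")
    return 100 * agreements / total


# Hypothetical tally of coding decisions made by the two coders.
reliability = percent_agreement(agreements=49, disagreements=1)
print(f"{reliability:.0f}%")
```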

Research Sample 

After coding, 26 studies (36 effect sizes) that met the criteria were identified from
172 theses and dissertations and 68 articles and conference papers. These studies
form the sample of this meta-analytic study.

Figure 1. Flow chart of literature review


Data Analysis 

The current study uses 'study effect' meta-analysis for the analysis of the data. This
method is used for group differences when the arithmetic means of
the dependent variables of the studies included in the meta-analysis were not obtained
on the same scale (Lipsey and Wilson, 2001; Cohen, 1992). The aim of this method
is to calculate the standardized difference between the means of the experimental and control
groups in experimental studies, given by the formula d = (Xe - Xc) / SD, where Xe and Xc
are the experimental and control group means (Hunter
and Schmidt, 2004). The resulting d value represents the effect size and forms the
basis for the meta-analysis. In this study, the experimental group is the group to which
one of the alternative assessment techniques was administered, and the control group
is the one that was assessed in a traditional way. Accordingly, a positive effect
size favors alternative assessment, and a
negative effect size favors traditional assessment.
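
The following sketch shows one common way to compute this effect size. It assumes that SD is the pooled standard deviation of the two groups and applies the usual small-sample correction to obtain Hedges's g, the statistic reported in Figure 2. The group statistics in the example are hypothetical.

```python
import math


def cohens_d(mean_e, mean_c, sd_e, sd_c, n_e, n_c):
    """Standardized mean difference using the pooled SD (an assumption here)."""
    pooled_var = ((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / (n_e + n_c - 2)
    return (mean_e - mean_c) / math.sqrt(pooled_var)


def variance_of_d(d, n_e, n_c):
    """Approximate sampling variance of d."""
    return (n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c))


def hedges_g(d, n_e, n_c):
    """Small-sample correction: g = J * d with J = 1 - 3 / (4 * df - 1)."""
    j = 1 - 3 / (4 * (n_e + n_c - 2) - 1)
    return j * d


# Hypothetical posttest statistics for one primary study.
d = cohens_d(mean_e=75.0, mean_c=70.0, sd_e=10.0, sd_c=10.0, n_e=30, n_c=30)
g = hedges_g(d, 30, 30)
var_d = variance_of_d(d, 30, 30)
print(round(d, 3), round(g, 3), round(var_d, 5))
```

A positive value means the experimental (alternative assessment) group scored higher than the control group.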

According to Wolf (1986), if the effect sizes of a set of independent studies are
homogeneous (the homogeneity test is not statistically significant), the studies may be
said to test the same hypothesis. If they are heterogeneous (the test is statistically
significant), it is questionable whether each study tests the same hypothesis. In this paper, after
the effect size of each study was extracted, the Q statistic suggested by Cochran was used to
test the homogeneity of the effect sizes. Under the fixed effect model, the
Q value exceeded the critical value. For this reason, the analysis was carried out again
under the random effects model. The I² statistic was also used to determine the degree of
heterogeneity, and moderator variables were analyzed to explain its
sources. Comprehensive Meta-Analysis V2 (CMA) software was used for all
data analysis.
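
The sequence described above (fixed effect estimate, Q test, I², then a random effects estimate) can be sketched as follows. The sketch assumes inverse-variance weights and the DerSimonian-Laird estimate of between-study variance; the exact estimator applied by the software is not stated here, and the four effect sizes are hypothetical.

```python
def pool_effects(effects, variances):
    """Fixed and random effects pooling with Cochran's Q and I-squared."""
    k = len(effects)
    w = [1 / v for v in variances]                      # fixed effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]        # random effects weights
    random_mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_random = (1 / sum(w_star)) ** 0.5
    return {"fixed": fixed, "q": q, "df": df, "tau2": tau2, "i2": i2,
            "random": random_mean, "se_random": se_random}


# Hypothetical effect sizes and variances, for illustration only.
result = pool_effects([0.2, 0.9, 1.5, -0.1], [0.02, 0.10, 0.12, 0.05])
for name, value in result.items():
    print(name, round(value, 3))
```

When Q clearly exceeds its degrees of freedom, the random effects weights become more equal across studies, which is why the two pooled estimates can differ noticeably.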

Results 

In this paper, which examines the impact of alternative assessment techniques on student
academic achievement, the characteristics and effect sizes of the studies were
determined from the sample sizes, standard deviations and means of the 26
studies. The studies included in the meta-analysis involve a total of 2256 students,
1120 of whom are in the experimental groups and 1136 in the control
groups. Descriptive features of the studies included in the analysis are presented in
Table 1.


Table 1 


Descriptive Analysis of the Included Studies in Terms of Variables 


Variables                                        Frequency (f)   Percentage (%)

School level
  Primary                                              15             57.5
  Secondary                                             7             26.9
  Undergraduate                                         4             15.4

Subject matter
  Science and Technology                               12             46.2
  Math                                                  3             11.5
  English                                               4             15.4
  Other                                                 7             26.9

Alternative Assessment Technique
  Self-assessment                                       1              2.7
  Peer assessment                                       2              5.6
  Self- and peer assessment                             3              8.3
  Portfolio                                            24             66.7
  Grid                                                  2              5.6
  Diagnostic branched tree and structured grid          4             11.1


As seen in Table 1, most studies were carried out at the primary level (57.5%), and 
the least were at the undergraduate level (15.4%). Twelve studies (46.2%) were 
conducted in Science and Technology courses. Portfolios (66.7%) represented the 
most-used technique in the included studies. Of seven studies in which more than one 
assessment technique was used, four studies (11.1%) made use of diagnostic branched 
tree and structured grid, three studies (8.3%) used self- and peer assessment 
techniques together. 

To answer the first research question, 'What is the impact of alternative
assessment techniques on students' achievement?', the effect sizes of the studies included
in this meta-analysis were combined into a common effect size, together with its standard
error and variance. Figure 2 shows the descriptive statistics associated with the 36 effect sizes from
26 studies. The study names are presented on the left of the figure. The statistics for
these 36 effect sizes (Hedges's g, the standard error and the variance) are placed
in the center. On the right side of the figure is a forest plot, in which
the effect size for each study is shown as a dot and the lines show the width of the
confidence interval for each study. Confidence intervals that span 0.0
indicate effect sizes that are not significantly different from zero.
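
To make the reading of the forest plot concrete, the sketch below computes approximate 95% confidence intervals from Hedges's g and its standard error for four of the effect sizes listed in Figure 2 and checks whether each interval spans zero. The normal approximation (g ± 1.96 × SE) is an assumption; the software may compute the limits slightly differently.

```python
Z_95 = 1.96  # two-sided 95% critical value of the standard normal distribution


def confidence_interval(g, se, z=Z_95):
    """Approximate confidence interval for an effect size."""
    return g - z * se, g + z * se


def spans_zero(interval):
    """True when the interval contains 0, i.e. the effect is not significant."""
    lower, upper = interval
    return lower <= 0.0 <= upper


# (Hedges's g, standard error) as listed in Figure 2.
studies = {
    "Dogan": (0.214, 0.126),
    "Baris": (0.607, 0.144),
    "Balaban-a": (-0.309, 0.226),
    "Karamanoglu": (-1.170, 0.336),
}

for name, (g, se) in studies.items():
    ci = confidence_interval(g, se)
    verdict = "not significant" if spans_zero(ci) else "significant"
    print(f"{name}: [{ci[0]:.3f}, {ci[1]:.3f}] {verdict}")
```

Note that a narrow interval does not guarantee significance: an effect size close to zero can still have an interval that crosses it.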


Study name                      Statistics for each study

                                Hedges's g   Standard error   Variance
Yurdabakan and Cihanoglu-a         0.633         0.334          0.112
Yurdabakan and Cihanoglu-b         0.939         0.344          0.118
Olgun                              1.053         0.259          0.067
Kaya-a                             0.407         0.365          0.133
Kaya-b                             0.440         0.344          0.118
Ozcan                             -0.200         0.171          0.029
Kirikkaya and Vurkaya-a            0.792         0.301          0.091
Kirikkaya and Vurkaya-b            0.838         0.309          0.096
Kirikkaya and Vurkaya-c            0.620         0.310          0.096
Bagci                              0.104         0.318          0.101
Izgi-a                             2.327         0.405          0.154
Izgi-b                             0.922         0.402          0.162
Dogan                              0.214         0.126          0.016
Yurttas                            1.070         0.273          0.074
Memis                              0.018         0.245          0.060
Turan                              0.677         0.319          0.102
Gurel-a                           -0.120         0.816          0.119
Gurel-b                            0.315         0.359          0.129
Menevse                           13.058         0.798          0.637
Baris                              0.607         0.144          0.021
Koroglu-a                          1.382         0.284          0.081
Koroglu-b                          1.018         0.271          0.074
Koc                                0.530         0.242          0.059
Balaban-a                         -0.309         0.226          0.051
Balaban-b                          0.388         0.213          0.046
Balaban-c                          0.772         0.369          0.136
Balaban-d                          1.179         0.467          0.218
Anahtarci                          1.333         0.353          0.125
Ozek                               0.878         0.362          0.131
Karamanoglu                       -1.170         0.336          0.113
Mihladiz                           0.512         0.189          0.036
Okcu                               0.218         0.351          0.123
Parlakyildiz                       1.969         0.335          0.112
Gungor                             1.090         0.305          0.093
Erdogan                            0.198         0.297          0.088
Guven and Aydogdu                  0.698         0.253          0.069

Overall                            0.842         0.154          0.024

[Forest plot panel: Hedges's g and 95% CI for each study]


Figure 2. Forest plot of meta-analysis and study-level statistics 

As reflected by the forest plot in Figure 2, the studies with the narrowest confidence
intervals were Koc's (2010) and Dogan's (2012), whereas the one with the widest
confidence interval was Menevse's (2012). Thirty-two of the 36 effect sizes from the included
studies were positive; that is, 88.89% of the effect sizes
favor alternative assessment techniques. The summary statistics
derived from the 36 effect sizes are presented in Table 2.


Table 2 


Overall Weighted Average Random Effects and Fixed Effect Sizes and Homogeneity 
Statistics 


Analytical                Effect                           95% Confidence interval
models             N      size     df    Q-total   I² (%)     Lower      Upper

Fixed effect       36     0.550    35    397.980    91.2      0.463      0.637
Random effects     36     0.842                               0.540      1.144


Table 2 shows a weighted average effect of g=0.550 under the fixed effect model and
g=0.842 under the random effects model. Both weighted effect sizes are
significantly greater than zero, and the random effects estimate is large by Cohen's
standards. The Q statistic shows that the distribution is significantly heterogeneous,
and I² (91.2%) indicates that well over 75% of the variability in the distribution is
between-study variance; that is, variability in effect sizes exceeds sampling error. To explain
this heterogeneity, moderator analysis was carried out.
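
The heterogeneity figures can be checked with a few lines of arithmetic. The sketch below recomputes I² from the Q value and degrees of freedom in Table 2, and rebuilds the random effects confidence interval from the overall standard error shown in the last row of Figure 2 (0.154), assuming a normal approximation with z = 1.96.

```python
def i_squared(q, df):
    """Share of total variability attributed to between-study variance, in %."""
    return max(0.0, (q - df) / q) * 100


# Values reported in Table 2 (fixed effect row).
i2 = i_squared(q=397.980, df=35)

# Random effects estimate from Table 2 and overall SE from Figure 2.
estimate, se = 0.842, 0.154
lower = estimate - 1.96 * se
upper = estimate + 1.96 * se

print(f"I-squared = {i2:.1f}%")
print(f"95% CI = [{lower:.3f}, {upper:.3f}]")
```

Both results agree with the tabulated values (91.2% and 0.540 to 1.144), which suggests the table entries are internally consistent.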

To answer the second research question, 'How do various
alternative assessment techniques moderate the overall weighted effect size?', the
included studies were classified into five categories of alternative assessment
technique: peer assessment, self- and peer assessment, grid, portfolio, and DBT
and SG (diagnostic branched tree and structured grid). The findings for these
categories are presented in Table 3.


Table 3 


Moderator Analysis of Various Alternative Assessment Techniques (AAT) 


                                     Effect    95% Confidence Interval
Variable                       k     size        Lower       Upper       Qb      df     P

AAT                            35                                       2.241     5    0.210
  Peer assessment               2    0.423      -0.998       1.844
  Self- and peer assessment     3    0.877      -0.268       2.023
  Grid                          2    0.629      -0.736       1.994
  Portfolio                    24    1.012       0.604       1.420
  DBT and SG                    4    0.501      -0.481       1.482

As seen in Table 3, the learning environments in which portfolios are used have
the largest effect size (d=1.012), and those with peer assessment have the smallest effect
size (d=0.423). As the Q value is smaller than the critical value (Qb < χ²; p > .05), the
Q-between is not significant for this variable, indicating that the effect sizes of the
techniques do not differ beyond chance.

The third research question, 'How do demographic study features moderate this
effect size?', was posed to determine whether the effect sizes differ significantly
by subject matter and school level. For the subject matter analysis,
studies in subjects with too few studies, such as Computer,
Chemistry, Social Science and Environmental Science, were
excluded. The findings are shown in Table 4.


Table 4 

Moderator Analysis of Demographic Study Features 




                                     Effect    95% Confidence Interval
Variables                      k     size        Lower       Upper       Qb      df     P

Subject matter                 29                                       2.661     2    0.264
  Science and Technology       20    0.505       0.260       0.751
  English                       6    0.861       0.409       1.313
  Math                          3    0.905       0.251       1.559

School level                   36                                      26.069     2    0.000
  Primary                      23    0.549       0.176       0.922
  Secondary                     4    3.137       2.205       4.069
  Undergraduate                 9    0.648       0.059       1.237



According to the findings in Table 4, the studies conducted in Math demonstrated
the largest effect size (d=0.905), and those in Science and Technology showed the
smallest effect (d=0.505). However, as the Q statistic indicates, the distribution
of effect sizes is homogeneous. In other words, there is no significant
difference in effect size in terms of subject matter (Qb=2.661; p=0.264). The findings
concerning school level show that the largest effect was found at the secondary
level (d=3.137), while the smallest effect is at the primary level (d=0.549). As the Q
value exceeds the critical value with two degrees of freedom, the distribution of the
effect sizes is heterogeneous (Qb=26.069, p=0.000). Accordingly, the effect of
alternative assessment techniques on academic achievement varies significantly by
school level.
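
The p-values in Table 4 can be reproduced from the Q-between values. With two degrees of freedom, the upper tail probability of the chi-square distribution has the closed form exp(-Q/2), so no statistics library is needed. The sketch below is only a check of the reported figures, not the procedure used by the software.

```python
import math


def chi2_sf_df2(q):
    """P(X > q) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-q / 2)


# Critical value at alpha = .05 for df = 2, obtained by inverting the formula.
critical_05 = -2 * math.log(0.05)

p_subject = chi2_sf_df2(2.661)    # Q-between for subject matter
p_school = chi2_sf_df2(26.069)    # Q-between for school level

print(f"critical value = {critical_05:.3f}")
print(f"subject matter: p = {p_subject:.3f}")
print(f"school level:   p = {p_school:.6f}")
```

The subject matter value falls below the critical value of about 5.99, while the school level value is far above it, which matches the conclusions drawn in the text.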

Publication Bias 

To examine publication bias, a funnel plot was drawn (Figure 3). As
seen below, it is generally symmetrical around the mean of the distribution. Accordingly,
there is no evidence of publication bias compromising the results of this meta-analytic
review. In support of this, Rosenthal's fail-safe N was found to be 2048,
based on 36 effect sizes from 26 studies, with a z value of 14.90 and a
corresponding p-value of 0.00. This means that 2048
'null' studies would have to be added to this analysis for the p-value to exceed .05;
that is, 56.8 missing studies would be required
for each effect size to nullify the effect.
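
A rough check of the fail-safe N is sketched below. It assumes Rosenthal's formula in the form N = (sum of z)² / z_crit² - k, with the reported z treated as a Stouffer combined z (sum of z divided by the square root of k) and a two-tailed critical value of 1.96. These choices are assumptions; the text does not state which variant the software applied.

```python
import math


def fail_safe_n(combined_z, k, z_crit=1.96):
    """Number of null studies needed to bring the combined z down to z_crit."""
    sum_z = combined_z * math.sqrt(k)   # recover the sum of study-level z values
    return sum_z**2 / z_crit**2 - k


k = 36                                  # number of effect sizes
n_fs = fail_safe_n(combined_z=14.90, k=k)
per_effect = n_fs / k

print(f"fail-safe N is about {n_fs:.0f}")
print(f"about {per_effect:.1f} null studies per effect size")
```

Under these assumptions the result lands within a few studies of the reported 2048; the small gap is consistent with the z value having been rounded to two decimals.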




Figure 3. Funnel plot with effect sizes (horizontal axis) and standard errors (vertical 
axis) 


Discussion and Conclusion 

Based on 36 effect sizes derived from 26 studies conducted in Turkey, this meta-analysis
revealed that AAT has a positive impact on academic achievement, and this
effect is large by Cohen's standards. It is concluded that
AAT has a significant impact on student achievement. This result is
congruent with many studies (Anahtarci, 2009; Bagci, 2009; Baris,
2011; Barootchi and Keshavarz, 2002; Fenwick and Parsons, 1999; Gungor, 2005; Gurel,
2013; Guven and Aydogdu, 2009; Izgi, 2007; Kirikkaya and Vurkaya, 2011; Koroglu,
2011; Memis, 2011; Menevse, 2012; Olgun, 2011; Ozek, 2009; Parlakyildiz, 2008; Turan,
2013).

This meta-analysis examined whether the estimated effect size varies by
alternative assessment technique, subject matter and school level. The moderator
analysis of alternative assessment techniques showed that studies
using portfolios in class had the largest effect, those using self- and peer
assessment combined had the second-largest effect, and those using only peer assessment
had the smallest effect. However, the effect does not vary significantly across
alternative assessment techniques. This
meta-analysis also found that portfolios are most frequently used in the
primary grades and contribute more to the weighted average effect size
than the other techniques (Anahtarci, 2009; Gungor, 2005; Karamanoglu, 2006;
Mihladiz, 2007; Okcu, 2007; Ozek, 2009; Parlakyildiz, 2008). The effect of the other
techniques on achievement therefore remains important and
should be explored further.

The other moderator analysis was carried out on descriptive study features, namely
the subject matter and school level in which the primary studies were conducted. In
terms of subject matter, the treatments in Mathematics
courses had the largest effect size, while those in Science and Technology courses had a
relatively low effect size. However, the findings indicate that the
effect of AAT on achievement does not differ significantly by subject matter. As for school
level, interventions in secondary schools had a large effect
size, whereas those in primary schools had a moderate effect size. The difference in
effect sizes by school level is significant; that is,
the impact of AAT on achievement differs with school
level. Winking (1997) stated that alternative assessment requires higher-order cognitive skills
so that students can solve real-life problems. Additionally, the critical thinking and
creativity on which alternative assessment depends are known to develop over time (Eva,
Cunnington, Reiter, Keane and Norman, 2004).

It is essential that meta-analytic results be interpreted in light of some
of the limitations of the primary studies. Factors such as the length of the experiment, the
experimenter's characteristics, and difficulties during the experiments are likely to affect the
results. According to Corcoran, Dershimer and Tichenor (2004), even though many
teachers agree on the importance of using alternative assessment
techniques, they report that it is difficult to administer them to students.

In the current meta-analysis, the effect of alternative assessment techniques was
examined only with regard to student academic achievement. The effect of AAT on
attitudes, anxiety and motivation may be investigated in future meta-analytic studies.
The literature review for this meta-analysis showed that studies are lacking
in some subject matter areas. Accordingly, the comparison of effect sizes by
subject matter is incomplete. More experimental and quasi-experimental
studies may be conducted in other subjects such as Turkish Language, History,
and Chemistry. Considering the limited number of studies on AAT conducted in Turkey, a new
meta-analytic study may be designed that includes studies on AAT from other
countries.
Summary

Problem Statement: It is important that teachers use the measurement and evaluation
techniques recommended in the curricula when evaluating their students and their own
teaching activities. The curricula that have been renewed in Turkey since the 2004 education
reform propose that students be assessed in a way that brings out their multidimensional and
higher-order thinking skills, and recommend the use of alternative measurement and
evaluation techniques alongside traditional evaluation methods. Teachers who are unfamiliar
with the curriculum and who use these tools view their effectiveness with suspicion, which
makes the tools considerably harder to apply. Many studies on the effectiveness of
alternative assessment techniques have been conducted to date and are still being conducted.
However, no study was found in the literature that both establishes the effectiveness of
alternative assessment techniques on a large scale and shows which assessment technique is
more effective for academic achievement. This study was planned to review the literature on
alternative assessment, whose popularity has been growing because of the curriculum change
made in Turkey over the last ten years. Combining the data and findings obtained from
individual studies through meta-analysis forms the basis of the research. In this way, it
becomes possible to obtain the effect size for academic achievement assessed with
alternative methods and to discuss which assessment technique is more effective.

Purpose of the Study: This study aimed to investigate, through meta-analysis, the effect of
alternative assessment techniques on students' academic achievement and whether academic
achievement differs according to the type of alternative assessment technique used, the
subject in which the technique is applied, and the school level.

Method: In this study, meta-analysis was used to calculate, combine and interpret the effect
sizes of primary studies on the effectiveness of alternative assessment techniques for
academic achievement. To this end, a literature search was first carried out for studies
conducted since 2004, the year in which alternative assessment techniques were incorporated
into the curricula in Turkey. In this search, databases such as the CoHE National
Dissertation Center, ERIC, PsycINFO and the ASOS social sciences index, as well as the
journals of university education faculties, were searched using key words such as
"alternative assessment", "portfolio", "structured grid", "diagnostic branched tree", "peer
assessment" and "self-assessment". After the literature search, 26 studies were included in
the analysis. These studies (1) examined the effect of an alternative assessment technique
on students' academic achievement, (2) contained at least two independent samples with a
pretest-posttest experimental or quasi-experimental design, (3) contained the statistical
data needed for effect size calculations, (4) applied the alternative assessment technique
in Turkey, and (5) were conducted between 2004 and August 2014. The studies were entered
into a coding form developed by the researcher, for which interrater reliability was found
to be 98%. After the effect size of each study was calculated, the chi-square heterogeneity
test with (k-1) degrees of freedom (Q statistic) proposed by Cochran was used to test the
homogeneity of the effect sizes. The I² test was chosen to determine the degree of
heterogeneity. When the fixed effects model was applied, the within-group, between-group and
total heterogeneity values were found to be higher than the critical values. For this
reason, the effect sizes were recalculated using the random effects model. To identify the
source of the heterogeneity, moderator analysis was carried out for some categorical
variables. A funnel plot was drawn to test for publication bias, and the result was
supported with Rosenthal's fail-safe N test.

Findings: Because the effect sizes of the studies were heterogeneous (Q > χ², p < 0.05) and
the amount of heterogeneity between studies was high (I² = 91), a moderator analysis was
carried out; it showed that the source of heterogeneity was related to the school levels at
which the primary studies were conducted. In other words, while the effect sizes did not
differ according to subject or type of alternative assessment technique used, there was a
significant difference between effect sizes according to the students' school level. The
findings revealed that alternative assessment techniques have a positive and large effect
(d=0.84) on students' academic achievement. In addition, it was concluded that the use of
these techniques has a large effect on students' academic achievement in mathematics
(d=0.90) and that the effect of portfolio use (d=1.01) is also noteworthy. The publication
bias analysis showed that there was no bias that would distort the findings of this
meta-analysis and that the values obtained were highly reliable.

Conclusions and Recommendations: The results of the present study showed that alternative
assessment techniques are more successful than traditional assessment techniques in terms of
students' academic achievement. In addition, because the effectiveness of alternative
assessment does not differ significantly by the assessment technique used or by subject, it
can be said that using different authentic assessment techniques in class increases student
achievement. However, it was concluded that the effectiveness of these techniques differs
significantly according to the school level at which they are used. Because alternative
assessments require higher-order thinking skills, it was concluded that they are less
effective when used at lower school levels. It can be said that using these techniques
particularly in secondary or higher education will be more effective. In this meta-analysis,
the effectiveness of alternative assessment techniques was examined only in terms of
academic achievement. Future meta-analyses may investigate the effectiveness of alternative
assessment techniques in terms of attitude, anxiety and motivation. In the literature search
for the meta-analysis, not enough studies could be found for some subjects. Therefore, some
dimensions were missing in the comparisons by subject. Accordingly, new studies for other
subjects may contribute to the literature.

Keywords: Authentic assessment, portfolio, performance, effect size.




